(54) Title: NOVEL HUMAN TUMOUR SUPPRESSOR GENE



(57) Abstract 

A novel human progestin-regulated 
gene designated EDD (E3 isolated by 
Differential Display) is disclosed which 
encodes a product exhibiting significant 
amino acid sequence identity with the HYD 
protein (hyperplastic discs) from Drosophila 
melanogaster and the 100 kDa HECT 
(homologous to E6-AP carboxyl terminus) 
domain protein from rat. The EDD gene 
appears to represent a tumour suppressor 
gene and the detection of a polymorphism 
or alteration in the gene from a subject may 
be useful for the diagnosis or determination 
of a predisposition to hyperproliferative 
disease such as a cancer. An assay for 
assessing progestin-responsiveness in a 
subject is also disclosed. 

NOVEL HUMAN TUMOUR SUPPRESSOR GENE

Field of the Invention : 

This invention relates to a novel human progestin-regulated gene
designated EDD (E3 isolated by Differential Display) which encodes a
product exhibiting significant amino acid sequence identity with the HYD
protein (hyperplastic discs) from Drosophila melanogaster and the 100 kDa
HECT (homologous to E6-AP carboxyl terminus) domain protein from rat.

Background to the Invention :

The control of cell proliferation and differentiation in the normal
breast and in breast cancer involves complex actions and interactions of
steroid hormones (in particular estrogen and progesterone), peptide
hormones and growth factors (1, 2). How these agents act at critical control
points within the cell cycle to influence progression through the cycle or
exit to enter a pathway of differentiation is only partially understood (3-5).

Progestins are responsible for mammary gland lobuloalveolar
development during pregnancy (6), although there is evidence for a more
predominant role for estrogens than progestins in stimulating epithelial cell
proliferation in the normal premenopausal breast (7, 8). Progestins both
stimulate and inhibit breast cancer epithelial cell proliferation in vitro but the
predominant effect is growth inhibition, probably via induction of
differentiation (3, 4, 7, 9). Progestin action is mediated primarily through the
progesterone receptor (PR), which acts as a transcriptional transactivator for a
largely undefined set of progestin-responsive genes which may, in turn,
transcriptionally or post-transcriptionally influence additional genes or gene
products.

Only a limited number of genes have been implicated in progestin
action on cell proliferation. Previous studies by the present inventors have
identified c-myc and cyclin D1 as major downstream targets of
progestin-stimulated cell cycle progression in human breast cancer cells
(3, 10) while the delayed growth inhibitory effects of progestins involve
decreases in cyclin D1 and E gene expression (4, 9). While progestin effects
on c-myc gene expression are rapid and occur within minutes, effects on cyclin
expression begin several hours later, pointing to the presence of undefined
earlier events.

Since progestin action is complex and is likely to involve multiple genes,
many of which are currently unknown, the differential display RT-PCR
technique (DD-PCR) (11) was adopted to identify target genes in cultured
human breast cancer cells. The utility of this approach has been previously
demonstrated by the cloning of PRG1, a gene having significant homology with
isoforms of 6-phosphofructo-2-kinase/fructose-2,6-bisphosphatase (12). Using
the same technique, a novel progestin-regulated gene, EDD (designated DD5 in
the applicant's Australian Provisional patent application No. P06334), has been
identified.

Based on amino acid sequence similarity, EDD appears to be a human
homologue of the Drosophila tumor suppressor gene hyperplastic discs (hyd)
(13). Although the function of the HYD protein is unknown, significant
homology exists between its carboxyl terminus and those of human E6-AP and a
number of proteins identified through database searches (14). These HECT
domain family proteins function as ubiquitin-protein ligases (E3 enzymes)
(14-16), playing a role in the ubiquitination cascade that targets specific
substrate proteins for proteolysis. Notably, the protein encoded by EDD has a
carboxy-terminal HECT domain containing a cysteine residue that covalently
binds ubiquitin. This amino acid is conserved in all known HECT
domain-containing E3 enzymes and is involved in the transfer of ubiquitin. It
is therefore proposed that the EDD gene represents a novel human tumour
suppressor gene encoding a ubiquitin-protein ligase.

Disclosure of the Invention : 

In a first aspect, the present invention provides an isolated
polynucleotide molecule comprising a nucleotide sequence encoding a
protein which comprises the following N-terminal amino acid sequence:

MTSIHFVVHP

or a biologically active portion of said protein.

Preferably, the encoded protein comprises the following N-terminal
amino acid sequence:

MTSIHFVVHPLPGTEDQLNDRLREVSEK

More preferably, the encoded protein is a ubiquitin-protein ligase and
has an approximate molecular weight of 300 kDa.

Most preferably, the isolated polynucleotide molecule comprises a
nucleotide sequence substantially corresponding to or, at least, > 90% (more
preferably, > 95%) homologous to the nucleotide sequence shown at Figure
3B from nucleotide 34 to nucleotide 8424 or a portion(s) thereof.
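
The specification does not state how the percentage homology is to be calculated. As a minimal sketch, assuming that homology is read as percent identity over a gap-free alignment of equal length (an assumption, not a definition taken from this document), the > 90% and > 95% thresholds could be checked as follows. The function name and the two toy fragments are hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of matching positions between two pre-aligned sequences.

    Assumption: the sequences are already aligned, have equal length and
    contain no gap characters. This is only one common reading of
    "% homologous"; the specification does not define the calculation.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


# Hypothetical 20-nucleotide fragments that differ at the last position only.
reference = "ATGACCTCCATCCATTTCGT"
candidate = "ATGACCTCCATCCATTTCGA"

identity = percent_identity(reference, candidate)
print(f"{identity:.1f}% identical; exceeds 90%: {identity > 90.0}")
```

In practice the two sequences would first be aligned with a dedicated alignment tool; the comparison above only covers the final counting step.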

The term "portion(s) thereof" in this regard is to be understood as
referring to portion(s) of the nucleotide sequence which encode biologically
active peptide or polypeptide portions or antigenic determinants. Typically,
such "portion(s) thereof" will comprise a nucleotide sequence of at least 50
nucleotides in length. However, shorter portions of the nucleotide sequence
(e.g. portions of ≥ 8 nucleotides in length) may also be used in or for the
production of probes useful for hybridization assays.

Thus, in a second aspect, the present invention provides an
oligonucleotide or polynucleotide probe molecule labelled with a suitably
detectable label (e.g. radioisotopes), comprising a nucleotide sequence
substantially corresponding to, or complementary to, a > 8 nucleotide portion
of the nucleotide sequence shown at Figure 3B from nucleotide 34 to
nucleotide 8424.

Such probe molecules may be DNA or RNA. They may be used, for 
example, to quantitatively or qualitatively detect EDD mRNA in total or 
poly(A) RNA isolated from one or more tissues. As discussed below, such 
assays may have diagnostic and/or prognostic value. 

The present invention also further extends to oligonucleotide primers
for the above sequences, antisense sequences and homologues of said
primers and antisense sequences, complementary ribozyme sequences,
catalytic antibody binding sites and dominant negative mutants of the
polynucleotide molecules.

Preferably, the polynucleotide molecule of the first aspect is of human
origin. More preferably, the polynucleotide molecule is of human cancer cell
origin.

The isolated polynucleotide molecule of the first aspect may be
incorporated into plasmids or expression vectors or cassettes, which may
then be introduced into suitable bacterial, yeast, insect or mammalian host
cells. Such host cells may be used to express the protein or biologically
active fragment thereof encoded by the isolated polynucleotide molecule.

As mentioned above, the amino acid sequence of the EDD product
(pEDD) shows significant sequence similarity to the amino acid sequence of
the HYD protein of Drosophila. The Drosophila hyd gene is a tumour
suppressor gene and it is therefore expected that the EDD gene is similarly a
tumour suppressor gene. Further, it is expected that the pEDD protein will
have activity similar to the HYD protein. Particularly, inactivating or other
mutations in EDD may give rise to susceptibility to cancer, thus making
EDD a potential target for preventive or therapeutic strategies. Mutations in
EDD could also be diagnostic for cancer susceptibility, particularly for early
diagnosis in normal or pre-neoplastic disease or be useful in predicting
tumour progression or response to therapy (i.e. a prognostic marker).

Further, since EDD is likely to be involved in cell cycle regulation by
progestins and other mitogens, EDD is a potential target for antiproliferative
agents (i.e. cancer therapeutics). Moreover, as EDD is one of only a few
known genes to be regulated by progestins, EDD is an important mediator of
progestin action and a marker of clinical responsiveness to progestins.

As a tumour suppressor gene, EDD could be a familial cancer
susceptibility gene, for example, like p16 (Multiple Tumor Suppressor Gene
1, MTS1) or the familial breast cancer susceptibility gene BRCA1. It might
also have a role in sporadic cancer.

In a third aspect, the present invention provides in a substantially pure
form, a protein (designated pEDD) comprising the following N-terminal
amino acid sequence:

MTSIHFVVHP

or a biologically active portion of said protein.

Preferably, the protein of the third aspect comprises the following
N-terminal amino acid sequence:

MTSIHFVVHPLPGTEDQLNDRLREVSEK

More preferably, the protein of the third aspect is a ubiquitin-protein
ligase and has an approximate molecular weight of 300 kDa.

Most preferably, the protein of the third aspect comprises an amino
acid sequence substantially corresponding to the amino acid sequence shown
in Figure 3C.

The biologically active portions may consist of polypeptide or peptide
sequences which inhibit, mimic or enhance the biological effect of the
protein. Additionally, the biologically active portions may also represent
antigenic determinants useful for raising antibodies specific to the protein.

The protein, or biologically active portion thereof, according to the
third aspect may be purified from natural sources (e.g. whole brain, heart,



WO 98/48010 PCT/AU98/00280 


testis and appendix) or suitable cell lines, or may be produced recombinantly
by any of the methods common in the art (Sambrook et al., 1989).

In a fourth aspect, the present invention provides a non-human
organism transformed with the polynucleotide molecule of the first aspect of
the present invention.

The organisms which may be usefully transformed with the 
polynucleotide molecule of the first aspect include bacteria such as E. coli 
and B. subtilis, eukaryotic cell lines such as CHO, fungi and plants. 

In a fifth aspect, the present invention provides an antibody specific to
the protein designated pEDD or an antigenic portion thereof.

The antibody may be polyclonal or monoclonal and may be produced 
by any of the methods common in the art. 

It is also to be understood that the invention relates to kits for
diagnostic assays, said kits comprising a protein or biologically active portion
thereof according to the second aspect and/or an antibody according to the
fifth aspect. Additionally, or alternatively, the kit may comprise
oligonucleotide probes for hybridisation assays or oligonucleotide primers for
PCR-based assays.

In a sixth aspect, the present invention provides a protein or antigenic
portion thereof, capable of binding to an anti-pEDD antibody.

As will be seen hereinafter, in some tissues EDD appears to be
regulated by progestin. EDD may, therefore, provide a useful marker for
progestin-responsiveness in a subject. For example, as a marker of breast or
endometrial tumour or meningioma responsiveness to progestins or progestin
antagonists (antiprogestins) - i.e. high levels may indicate that the tumour is
responsive to progestins/antiprogestins and could be sensitive to
progestin/antiprogestin therapy. EDD may also be a useful prognostic marker
since hormonally responsive tumours often have a better prognosis (i.e.
patients have longer disease-free survival and overall survival).
Alternatively, mutations, deletions or amplification of the EDD gene might
predict tumour progression and disease prognosis independent of its role as a
progestin-regulated gene. Thus, levels of EDD mRNA present in isolated
cells or tissue samples may be assessed by DNA or RNA probes or primers in
hybridisation assays or PCR analysis. Alternatively, the level of pEDD
protein may be assessed through the use of the abovementioned antibodies.

Thus, in a seventh aspect, the present invention provides an assay for
assessing progestin-responsiveness in a subject comprising the steps of:

(i) isolating cells or tissue from said subject; and

(ii) detecting the presence of a protein comprising an amino acid
sequence substantially corresponding to that shown at Figure 3C.

In some circumstances, it may be preferred to expose the isolated cells 
or tissue to progestin or agonist or antagonist compounds and, subsequently, 
determine whether the progestin or agonist or antagonist compound has 
induced the production of the pEDD protein. 

In an eighth aspect, the present invention provides a method for the 
diagnosis or determination of a predisposition to hyperproliferative disease, 
especially cancer, comprising detecting in a subject a polymorphism or 
alteration in the EDD gene which is indicative of said hyperproliferative 
disease or a predisposition to said hyperproliferative disease or 
developmental abnormality. 

The modulation of EDD activity may also have therapeutic utility in 
the treatment of proliferative disorders, such as malignant or non-malignant 
hyperproliferative disease (e.g. breast and other cancers), and dermatological 
diseases or developmental abnormalities. Further, modulation of EDD may 
be of therapeutic value in processes involving progestin action in progestin 
target organs (e.g. fertility control, and reproductive tissue function). 

EDD activity could be regulated by: 

- synthetic compounds, either stimulatory or inhibitory (i.e. agonists or 
antagonists), 

- ribozymes specific for EDD (i.e. to down-regulate endogenous EDD 
activity), and 

- gene therapy using expression vectors or oligonucleotides or other
delivery systems (e.g. viral) containing a nucleotide sequence coding for EDD
sense (i.e. to augment endogenous pEDD protein levels and activity) or
antisense (i.e. to down-regulate endogenous pEDD protein levels and
activity). Sense vectors could contain only a portion of the EDD coding
sequence if separate desirable activities are found to reside in separate
portions of the protein. Such vectors could also include dominant negative
mutants of EDD which encode a gene product causing an altered phenotype
by, for example, reducing or eliminating the activity of the endogenous pEDD
protein. This might be caused through the interruption of formation of
enzyme complexes, substrate competition or the formation of a defective
substrate or reaction product. Particular examples of dominant negative
mutants may be mutants that encode truncated proteins retaining pEDD
sequences involved in protein-protein interactions or substrate recognition
but which lack enzymatic or other activities residing elsewhere in the pEDD
protein. Expression of such mutants would inhibit correct substrate
modification or processing. Thus, as a putative ubiquitin-protein ligase,
truncated pEDD proteins could be expressed which allow the binding of
protein substrates but which lack the sequences necessary for the subsequent
ubiquitination and destruction of these sequences.

Since the pEDD protein seems likely to be involved in cell cycle
(growth) regulation including cell proliferation, differentiation and cell
death, the pEDD protein or an agonist or antagonist might be used as a
chemoprotectant in cancer chemotherapy treatments. That is, the pEDD
protein or agonist/antagonist may be administered to a patient so as to stop
the cell cycle including cell proliferation, differentiation and cell death in
normal cells prior to treatment with standard cancer drugs (e.g. methotrexate,
vinblastine and cisplatin). The arrested cells would thereby be less prone to
damage by chemotherapy toxicity.

The term "substantially corresponding" as used herein in relation to the
nucleotide sequence is intended to encompass minor variation(s) in the
nucleotide sequence which due to degeneracy in the DNA code do not result
in a change in the encoded protein. Further, this term is intended to
encompass other minor variations in the sequence which may be required to
enhance expression in a particular system but in which the variation(s) do
not result in a decrease in biological activity of the encoded protein.

The term "substantially corresponding" as used herein in relation to
amino acid sequence is intended to encompass minor variations in the amino
acid sequence which do not result in a decrease in biological activity of the
encoded protein. These variation(s) may include conservative amino acid
substitution(s). The substitution(s) envisaged are:-

G, A, V, I, L, M; D, E; N, Q; S, T; K, R, H; F, Y, W, H; and P, Nα-alkylamino acids.
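
The substitution groups listed above can be applied mechanically when comparing a variant with a reference sequence. The following is a minimal sketch of such a check, not a method described in this specification. The function names and the two short peptides are hypothetical, and the last group is represented by proline alone because Nα-alkylamino acids have no one-letter code.

```python
# Substitution groups as listed in the specification. Histidine (H) appears
# in two groups. N-alpha-alkylamino acids have no one-letter code, so the
# final group is represented by proline alone (an assumption of this sketch).
CONSERVATIVE_GROUPS = [
    set("GAVILM"),
    set("DE"),
    set("NQ"),
    set("ST"),
    set("KRH"),
    set("FYWH"),
    set("P"),
]


def is_conservative(original: str, replacement: str) -> bool:
    """Return True if both residues share at least one listed group."""
    a, b = original.upper(), replacement.upper()
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


def classify_substitutions(reference: str, variant: str):
    """List (position, reference residue, variant residue, conservative?).

    Positions are 1-based. Only positions that differ are reported, and the
    two sequences must already have the same length.
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must have the same length")
    return [
        (pos, ref, var, is_conservative(ref, var))
        for pos, (ref, var) in enumerate(zip(reference, variant), start=1)
        if ref != var
    ]


# Hypothetical peptides: K2R is within a listed group, H5D is not.
for change in classify_substitutions("MKSIHF", "MRSIDF"):
    print(change)
```

Whether a substitution that passes this check also preserves biological activity is an experimental question; the check only reports membership of the listed groups.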

The terms "comprise", "comprises" and "comprising" as used
throughout the specification, are intended to refer to the inclusion of a stated
step, component or feature or group of steps, components or features with or
without the inclusion of a further component or feature or group of steps,
components or features.

The invention will hereinafter be further described by way of the 
following non-limiting example and accompanying figures. 

Brief description of the accompanying figures : 

Figure 1. Identification of a differentially expressed cDNA in T-47D cells 
treated with the synthetic progestin ORG 2058. 

A) Identification of EDD by differential display. Total RNA obtained 
from T-47D cells treated with ORG 2058 or vehicle control (ethanol) for 3 h 
was used as a template for differential display PCR reactions. The PCR 
products were separated on a 4.5% polyacrylamide denaturing gel and 
visualized by autoradiography. The arrow indicates the EDD DD-PCR 
product (DD5-1; see Fig. 3A) which is present at a higher level in the 
progestin treated (ORG 2058) compared with control lane. 

B) Confirmation of the progestin induction of EDD by Northern blot
analysis. T-47D cells proliferating in medium supplemented with 5%
charcoal-treated FCS were treated with 10 nM ORG 2058 or ethanol vehicle
(CONTROL) in the presence or absence of actinomycin D (ACT) and after 3 h
total RNA was harvested for Northern analysis. The Northern blot was
probed with the EDD clone P19.

C) Effect of cycloheximide on progestin induction of EDD mRNA.
T-47D cells proliferating in medium supplemented with 5% charcoal-treated
FCS were treated with ORG 2058 (10 nM), cycloheximide (CHX, 20 µg/ml),
ORG 2058 and CHX simultaneously or ethanol vehicle and harvested for total
RNA at 1 h. The Northern blot was probed with the EDD DD-PCR fragment
DD5-1.

Figure 2. Expression of EDD mRNA in human tissues. 

A) Northern blot analysis of polyA+ RNA from human tissues. The
blot was hybridized with the P19 cDNA clone of EDD. Molecular sizes of
markers are indicated. PBL, peripheral blood leukocytes.

B) Dot blot analysis of polyA+ RNA from human tissues. The blot
was hybridized with the P19 cDNA clone of EDD. Row A: 1, whole brain; 2,
amygdala; 3, caudate nucleus; 4, cerebellum; 5, cerebral cortex; 6, frontal
lobe; 7, hippocampus; 8, medulla oblongata; Row B: 1, occipital lobe; 2,
putamen; 3, substantia nigra; 4, temporal lobe; 5, thalamus; 6, sub-thalamic
nucleus; 7, spinal cord; Row C: 1, heart; 2, aorta; 3, skeletal muscle; 4, colon;
5, bladder; 6, uterus; 7, prostate; 8, stomach; Row D: 1, testis; 2, ovary; 3,
pancreas; 4, pituitary gland; 5, adrenal gland; 6, thyroid gland; 7, salivary
gland; 8, mammary gland; Row E: 1, kidney; 2, liver; 3, small intestine; 4,
spleen; 5, thymus; 6, peripheral leukocyte; 7, lymph node; 8, bone marrow;
Row F: 1, appendix; 2, lung; 3, trachea; 4, placenta; Row G: 1, fetal brain; 2,
fetal heart; 3, fetal kidney; 4, fetal liver; 5, fetal spleen; 6, fetal thymus; 7,
fetal lung.

Figure 3. Cloning and predicted amino acid sequence of EDD. 

A) A schematic representation of EDD structure with a restriction map
for the EDD cDNA indicating the sites used for cloning the full-length EDD
construct and the cDNA clones used to derive the EDD sequence shown
beneath. The DD-PCR cDNA fragment identified by differential display was
designated DD5-1 and a cDNA clone derived from the 5' RACE product and
the original DD-PCR product, DD5-2. All cDNA clones were isolated from a
human placenta cDNA library with the exception of H1 which was isolated
from a human heart cDNA library.

B) The nucleotide sequence of EDD. The start and stop codons are
underlined.

C) Predicted amino acid sequence of pEDD. There are two regions
with high homology (~60%) to HYD (a central sequence and a carboxyl
sequence containing the HECT domain) and these and other highly
conserved sequences are shown in bold type, while two putative nuclear
localization signals are boxed. The HECT domain is underlined and in bold
type and includes a conserved cysteine at residue 2768 (boxed). A region
showing homology to polyA-binding proteins is italicized and the peptide
sequence to which antiserum AbPEP1 was raised is underlined. The
numbers refer to positions of amino acids.

Figure 4. Chromosomal localization of the EDD gene.

Metaphase showing FISH with the H1 probe. Normal male
chromosomes were stained with DAPI. Hybridization sites on chromosome 8
are indicated by an arrow.

Figure 5. Characterization of EDD protein. 

A) Detection of recombinant EDD protein with AbPEP1. Sf9 cells
infected with baculovirus containing a truncated EDD construct (EDD 100
kDa) were boiled in SDS-sample buffer prior to SDS-PAGE through a 6% gel,
transferred to nitrocellulose and blotted with AbPEP1 or peptide-blocked
AbPEP1.

B) Determination of the size of the EDD protein. EDD was
immunoprecipitated from T-47D lysate using AbPEP1. The
immunoprecipitate (IP) was resolved by SDS-PAGE through a 6% gel
alongside the products of in vitro translated full length EDD (IVT) and
immunoprecipitated in vitro translated EDD (IVT-IP). The T-47D
immunoprecipitate was transferred to nitrocellulose and blotted for EDD with
AbPEP1 while the remainder of the gel was dried and autoradiographed.
Molecular masses of marker proteins are indicated.

C) Detection of EDD protein in T-47D lysates. Immunoprecipitated
EDD was run alongside 40 µg total protein from T-47D lysate. Total proteins
were blotted with either AbPEP1 or peptide-blocked AbPEP1 and the
immunoprecipitate was blotted with AbPEP1.

Figure 6. EDD protein expression in human tissues and cell lines.

Expression of EDD in normal breast and breast cancer cell lines. Total
cell lysates from a range of cell lines were separated by SDS-PAGE through a
6% gel, transferred to nitrocellulose and blotted with AbPEP1. 184 is a
normal breast cell line, 184B5 an immortalized derivative, and the remainder
are breast cancer cell lines, MCF-7M being a sub-line of MCF-7.

Figure 7. Sequence of the rat 100 kDa protein cDNA.

Autoradiograph of the sequencing gel obtained when one clone was
sequenced using the EDD-specific FC2 primer, with the sequence (a) listed
alongside the autoradiograph. The published sequence (b) is shown
alongside and the missing base denoted by an asterisk.

Figure 8. Ubiquitin thiol ester formation by EDD.

In vitro translation of truncated (A) or full-length (B) EDD wild type or
mutant (C2768A) protein in the presence of 35S-methionine was followed by
a 10 min incubation at 25 °C either with or without purified GST-ubiquitin
(or GST in part A) fusion protein. Samples were resolved by SDS-PAGE (A,
7% gel; B, 6% gel) following either incubation at 25 °C for 20 min in
non-reducing sample buffer containing 4 M urea or boiling in sample buffer
containing 100 mM DTT. Ubiquitin- and GST-ubiquitin-bound forms are
marked with arrows.

Example :

MATERIALS AND METHODS 
Reagents 

Steroids and growth factors were obtained from the following sources:
ORG 2058 (16α-ethyl-21-hydroxy-19-norpregn-4-en-3,20-dione), Amersham
Australia Pty Ltd, Sydney, Australia; human transferrin, Sigma Chemical Co.,
St. Louis, Mo.; and human insulin, Actrapid, CSL-Novo, North Rocks, Australia.
Steroids were stored at -20 °C as 1000-fold-concentrated stock solutions in
absolute ethanol. Cycloheximide (Calbiochem-Behring Corp., La Jolla, CA) was
dissolved at 20 mg/ml in water and filter sterilized. Actinomycin D (Cosmegen,
Merck Sharp and Dohme Research Pharmaceuticals, Rahway, NJ) was dissolved
at 0.5 mg/ml in sterile water and used immediately. Tissue culture reagents
were purchased from standard sources.
Cell culture 

The sources and maintenance of the human breast cancer and normal cell
lines used were as described previously (12, 22), as were tissue culture
experiments (12). Briefly, progestin (ORG 2058, 10 nM) and/or cycloheximide
(20 µg/ml) or actinomycin D (5 µg/ml) was added to the medium and control
flasks received the same volume of vehicle alone. To obtain RNA for differential
display, cells were grown in insulin-supplemented serum-free medium and
treated for 3 h with ORG 2058 or ethanol vehicle. Subsequent progestin
stimulation experiments were carried out in medium containing 5%
charcoal-stripped fetal calf serum without insulin.
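
The stock and working concentrations given under Reagents and Cell culture imply fixed dilution factors (1000-fold for the steroid and cycloheximide stocks, 100-fold for actinomycin D). As a minimal sketch of that arithmetic, the volumes to add can be worked out with C1 x V1 = C2 x V2. The 20 ml medium volume and the function name are hypothetical, and the 10 µM ORG 2058 stock is inferred from the stated 1000-fold concentration rather than quoted from the text.

```python
def stock_volume_ul(final_conc: float, stock_conc: float, medium_ml: float) -> float:
    """Microlitres of stock needed to bring the medium to final_conc.

    Both concentrations must be in the same unit. Uses C1*V1 = C2*V2 and
    ignores the small volume contributed by the stock itself.
    """
    if final_conc <= 0 or stock_conc <= 0 or medium_ml <= 0:
        raise ValueError("concentrations and volume must be positive")
    if stock_conc <= final_conc:
        raise ValueError("stock must be more concentrated than the target")
    return final_conc * medium_ml * 1000.0 / stock_conc


MEDIUM_ML = 20.0  # hypothetical volume of medium per flask

# name: (final concentration, stock concentration), same unit within a pair
additions = {
    "ORG 2058 (nM)": (10.0, 10_000.0),           # 1000-fold stock, inferred
    "cycloheximide (ug/ml)": (20.0, 20_000.0),   # 20 mg/ml stock
    "actinomycin D (ug/ml)": (5.0, 500.0),       # 0.5 mg/ml stock
}

for name, (final, stock) in additions.items():
    volume = stock_volume_ul(final, stock, MEDIUM_ML)
    print(f"{name}: add {volume:.0f} ul stock ({stock / final:.0f}-fold dilution)")
```

Control flasks would receive the same volume of vehicle alone, as stated in the text.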
RNA isolation and Northern analysis: 

Cells harvested from duplicate 150 cm² flasks were pooled, RNA was
extracted by a guanidinium-isothiocyanate-cesium chloride procedure and
Northern analysis was performed as previously described with 20 µg of total
RNA per lane (3, 23). The membranes were hybridized overnight (50 °C) with
probes labelled with [α-32P]dCTP (Amersham Australia Pty Ltd) using a
Prime-a-Gene labelling kit (Promega Corp., Sydney, Australia). The membranes
were washed at a highest stringency of 0.2 × SSC (30 mM NaCl, 3 mM sodium
citrate [pH 7.0]) / 1% sodium dodecyl sulfate at 65 °C and exposed to Kodak
X-OMAT or BIOMAX film at -70 °C. Human multiple tissue Northern blots or RNA
Master blot (CLONTECH Laboratories Inc., Palo Alto, CA) were hybridized under
conditions recommended by the manufacturer. The mRNA abundance was
quantitated by densitometric analysis of autoradiographs using a Molecular
Dynamics Densitometer and software (Molecular Dynamics, Sunnyvale, CA).
The accuracy of loading was estimated by re-hybridizing membranes with a
[γ-32P]ATP end-labelled oligonucleotide complementary to 18S rRNA (24, 25).
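
The text describes densitometric quantitation and a separate 18S rRNA hybridization used to check loading, but does not give a formula for combining the two. One common approach, shown here only as a sketch under that assumption, is to divide each EDD band signal by the 18S signal from the same lane and then express the treated lane relative to the control lane. All function names and densitometry values below are hypothetical.

```python
def normalised_signal(target: float, loading_control: float) -> float:
    """Target band signal divided by the loading-control signal of that lane."""
    if loading_control <= 0:
        raise ValueError("loading control signal must be positive")
    return target / loading_control


def fold_induction(treated: tuple, control: tuple) -> float:
    """Ratio of normalised signals, treated lane over control lane.

    Each argument is a (target signal, 18S signal) pair for one lane.
    """
    control_norm = normalised_signal(*control)
    if control_norm <= 0:
        raise ValueError("control lane signal must be positive")
    return normalised_signal(*treated) / control_norm


# Hypothetical densitometry readings in arbitrary units: (EDD, 18S rRNA).
control_lane = (1200.0, 8000.0)
progestin_lane = (3300.0, 8800.0)

print(f"fold induction: {fold_induction(progestin_lane, control_lane):.2f}")
```

Normalising before taking the ratio prevents a more heavily loaded lane from being mistaken for induction.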
Differential display 

Differential display was carried out as previously described (11) using a
Hieroglyph mRNA Profile Kit No. 1 (Genomyx Corporation, Foster City, CA) and
recommended protocol. First strand cDNA synthesis was carried out in 96-well
format 0.2 ml thin walled tubes. Typically 200 ng total RNA from T-47D cells
treated with the synthetic progestin ORG 2058 for 3 h or from control T-47D
cells was reverse transcribed with Expand Reverse Transcriptase enzyme
(Boehringer Mannheim Pty Ltd, Castle Hill, Australia) following annealing with
4 pmol anchored primer (5'ACGACTCACTATAGGGC(T)12AC). Subsequent PCR
amplification was performed with one-tenth of the resultant cDNA in duplicate
reactions containing [α-33P]dATP with the anchored primer (0.2 µM), an
arbitrary primer (5'ACAATTTCACACAGGAGCTAGCAGAC, 0.2 µM) and
Expand Long Template Taq DNA Polymerase (Boehringer Mannheim). The PCR
products were denatured and separated on a 4.5% denaturing polyacrylamide
gel at 800 V for 16 h using the Genomyx Long Read Sequencing System reagents
and apparatus. The gel was dried on the glass plate and exposed to X-ray film
for 16-72 h. The DD-PCR product of interest was excised from the gel and
amplified by PCR under the conditions recommended by the kit manufacturer
using an M13 forward primer (5'AGCGGATAACAATTTCACACAGGA) and a T7
promoter primer (5'TAATACGACTCACTATAGGG). The reamplified PCR
products were purified from 0.8% agarose gels using QIAEX reagents (Qiagen
Pty Ltd, Clifton Hill, Australia).

Cloning and sequencing of cDNAs 

Double stranded DNA templates were sequenced using the fmol DNA
Cycle Sequencing System (Promega Corp.) with [33P]-labelled primers. The
M13 primer was used for direct sequencing of DD-PCR products and the T7
and SP6 (5'GATTTAGGTGACACTATAG) promoter primers were used for
sequencing PCR products cloned into the pGEM-T vector (Promega Corp.).
Sequence database searches were performed at the NCBI using the Blast or
Fasta network services. Peptide motif searches were carried out against the
Prosite database.

Two primers (FC2: 5'GACGAAGGGCCCTGACTGCGCGAGAAGAAGC and
R2: 5'AAAGAATTCTGTCATGGAGTCTGAACGTCG) that flank the region
containing the reported rat 100 kDa start codon (26) were used to amplify
cDNA extracted from a rat hypothalamus library (CLONTECH). The resulting
PCR product was cloned into pGEM-T (Promega Corp.) and four clones were
sequenced.

Rapid amplification of cDNA 5' ends (5'RACE)

Additional sequence was obtained with the aid of a 5'RACE kit (Life
Technologies Inc., Gaithersburg, MD), following the manufacturer's
instructions. Briefly, a gene specific primer (GSP1:
5'CACGCTCCAATGCAAGCTGG) was used to prime first strand cDNA
synthesis. Following removal of the RNA strand, cDNA was 5' poly(dC) tailed
and amplified by PCR. The target cDNA was amplified using an anchor primer
(UAP: 5'GGCCACGCGTCGACTAGTACGGGIIGGGIIGGGIIG, where I represents
deoxyinosine) in combination with a second gene specific primer (GSP2:
5'CGATCTTCCCTGATTCGAGGTGGC). Various gel-purified PCR products were
further PCR amplified, primed by UAP and a third gene specific nested primer
(GSP3: 5'CTGTATTGACAATGCTCCACC).
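
As a quick consistency check on the three gene specific primers above, their length, GC content and reverse complement can be computed directly from the sequences. This is a minimal sketch and not part of the described procedure. The helper names are hypothetical, and the sequences are entered as they read once stray spacing is removed.

```python
# Complement table for unambiguous DNA bases only.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def clean(seq: str) -> str:
    """Remove whitespace and upper-case a primer sequence."""
    return "".join(seq.split()).upper()


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in the sequence."""
    seq = clean(seq)
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def reverse_complement(seq: str) -> str:
    """Reverse complement, i.e. the strand the primer anneals to, read 5' to 3'."""
    return clean(seq).translate(_COMPLEMENT)[::-1]


# Gene specific primers from the 5'RACE section, written 5' to 3'.
primers = {
    "GSP1": "CACGCTCCAATGCAAGCTGG",
    "GSP2": "CGATCTTCCCTGATTCGAGGTGGC",
    "GSP3": "CTGTATTGACAATGCTCCACC",
}

for name, seq in primers.items():
    print(f"{name}: {len(seq)} nt, {gc_percent(seq):.1f}% GC, "
          f"anneals to 5'{reverse_complement(seq)}")
```

Because the three primers are antisense to the mRNA, their reverse complements correspond to the sense-strand sites they anneal to.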
cDNA library screening 

10⁶ plaques from a human heart cDNA library in the Lambda ZAPII
vector primed with both oligo (dT) and random primers (Stratagene, La Jolla,
CA) were transferred to nylon membranes (Hybond N, Amersham Australia Pty
Ltd) and screened with both the original DD-PCR fragment and the RACE
product as [32P]-labelled probes. This led to isolation of clone H1 (2.55 kb).
This clone and the RACE product were used to screen 10⁶ recombinants from a
human placenta 5'-STRETCH PLUS cDNA library in λgt10 primed with both
oligo (dT) and random primers (CLONTECH Laboratories, Inc.). Sequencing of
cDNA clones in either pBluescript or λgt10 was carried out as described above
using vector-specific or gene-specific primers. Several rounds of isolation of
positive clones and further screening of this library led to the isolation of the
following overlapping clones covering the entire EDD open reading frame: P61
(1.95 kb), P43 (2.1 kb), P1 (1.5 kb), P19 (3 kb) and P47 (2.1 kb).
Fluorescence in situ hybridization 

A probe corresponding to clone H1 was nick-translated with
biotin-14-dATP and hybridized in situ at a final concentration of 20 ng/ml to
metaphases from two normal males. The fluorescence in situ hybridization
(FISH) method was modified from that previously described (27) in that
chromosomes were stained before analysis with both propidium iodide (as
counterstain) and DAPI (for chromosome identification). Images of metaphase
preparations were captured by a CCD camera using the CytoVision Ultra image
collection and enhancement system (Applied Imaging Int Ltd). FISH signals and
the DAPI banding pattern were merged for figure preparation.
Construction of recombinant cDNA clones for in vitro translation and protein
expression

The full length EDD sequence was cloned by ligating three PCR products
which spanned the open reading frame into pBluescript. The existing SalI and
EcoRI restriction sites used to ligate the fragments are indicated in Fig. 3A. The
carboxyl third of the cDNA was cloned into pBluescript such that an 890 amino
acid truncated protein corresponding to the predicted rat 100 kDa protein (from
aa 1910 to aa 2799) would be translated. An identical truncated cDNA fragment
was cloned into the pFASTBAC 1 expression vector (Life Technologies Inc.) for
protein expression using the BAC-TO-BAC baculovirus expression system in
Spodoptera frugiperda (Sf9) cells, and full length EDD cDNA was cloned into the
pRcCMV expression vector (Invitrogen, Leek, The Netherlands) for transient
transfection into HEK-293 cells. Mutagenesis of cysteine 2768 to alanine was
performed for full length and truncated constructs in pBluescript using the
Quick-Change site-directed mutagenesis kit (Stratagene). In vitro transcription
and translation were performed using the TNT T7 Quick coupled rabbit
reticulocyte lysate system (Promega Corp.) and [35S]-methionine (1000
Ci/mmole, ICN Biomedicals Australasia Pty Ltd, Seven Hills, Australia).

SDS-polyacrylamide gel electrophoresis (PAGE) and immunoblotting

Cells growing in mid-log phase were lysed in 1% Triton X100 buffer
containing 50 mM 4-(2-hydroxyethyl)-1-piperazineethanesulfonic acid (HEPES;
pH 7.5), 150 mM NaCl, 10% glycerol, 1.5 mM MgCl2, 1 mM EGTA, 10 mM
sodium pyrophosphate, 20 mM sodium fluoride, 1 mM dithiothreitol (DTT), 10
µg/ml each of aprotinin and leupeptin, 1 mM phenylmethylsulfonyl fluoride
(PMSF) and 200 µM sodium orthovanadate. Lysates were cleared by
centrifugation and quantitated according to a modified Bradford method (Bio-Rad
Laboratories, Hercules, CA), and typically 40 µg of total protein in SDS-sample
buffer (50 mM Tris-HCl, pH 6.8, 2% SDS, 10% glycerol, and 0.2% bromophenol
blue) containing 5% β-mercaptoethanol were resolved on 6% SDS-polyacrylamide
gels. Following electrophoresis, proteins were transferred to
nitrocellulose (TransBlot, Bio-Rad Laboratories) and subjected to
immunodetection. An EDD-specific peptide (SSEKVQQENRKRHGSS) was
synthesised, coupled via glutaraldehyde to diphtheria toxoid and used to
generate a rabbit anti-EDD antibody (designated AbPEP1).

Immunoprecipitation

Cleared cell lysates (typically 1 mg total protein) or in vitro translation
reactions were incubated with either control rabbit serum or AbPEP1 in the
presence or absence of a 10-fold excess of competing peptide for 1-2 hr at 4 °C.
Following incubation with Protein A Sepharose 4B (Zymed, San Francisco, CA),
immunoprecipitates were washed three times in the 1% Triton X100 lysis buffer
described above, resolved by SDS-PAGE and either transferred to nitrocellulose
and immunoblotted with AbPEP1 or, where applicable, dried onto Whatman 3MM
paper and subjected to autoradiography.

Ubiquitin-binding assay

[35S]-labelled in vitro translated truncated (~100 kDa) or full length
protein was tested for its ability to bind ubiquitin by incubating 5 µl of
translation reaction with or without 5 µg of purified GST protein or
GST-ubiquitin fusion protein for 10 min at 25 °C (28). Reactions were
terminated by incubating the mixtures in either SDS-sample buffer containing
100 mM DTT at 95 °C for 5 min or in SDS-sample buffer containing 4 M urea
instead of DTT at 25 °C for 20 min. Samples were resolved by SDS-PAGE through
6% or 7% gels followed by drying and autoradiography.

RESULTS 

Isolation and Northern blot analysis of a progestin regulated cDNA 

The differential display technique was used to identify mRNAs in T-47D
human breast cancer cells with altered levels of expression in response
to treatment with the synthetic progestin ORG 2058 for 3 h. When the
anchored primer, 5'ACGACTCACTATAGGGC(T)12AC, was used in
conjunction with the arbitrary primer,
5'ACAATTTCACACAGGAGCTAGCAGAC, a cDNA fragment of
approximately 850 bp that was more abundant in treated samples than in
control samples was identified and designated EDD (Fig. 1A). Northern
analysis of total cellular RNA from T-47D cells showed that transcription was
required for the observed ORG 2058 induction of EDD mRNA levels as this
was blocked in the presence of actinomycin D (Fig. 1B). Induction was also
prevented by cycloheximide, suggesting that EDD is not directly
transcriptionally regulated by progestin acting via the PR (Fig. 1C).

The tissue specificity of EDD gene expression was investigated by
hybridizing Northern blots of polyA+ RNA isolated from human tissues to
the EDD cDNA fragment. A single transcript of 9.5 kb was detected in a
variety of tissues (Fig. 2A) with the highest expression in testis, heart,
placenta and skeletal muscle. Hybridization to a more quantitatively loaded
RNA dot blot (Fig. 2B) confirmed that EDD is expressed at varying levels in
all tissues examined and that the mRNA was most abundant in testis and
expressed at high levels in brain, pituitary and kidney. Significant levels of
expression were also observed in placenta, uterus, prostate, stomach, fetal
lung and various brain tissues. EDD mRNA was also expressed in a range of
breast cancer cell lines, not all of which are progestin-responsive (not
shown).

Cloning of the full length EDD cDNA

The original DD5-1 fragment isolated by DD-PCR was 850 bp in length
and is shown schematically in Figure 3A. The DNA sequence of this fragment
had no homology to sequences of any known human genes. To obtain the
complete coding sequence from which EDD was derived, a combination of
5'RACE and screening of human heart and placenta cDNA libraries was used.
This resulted in a series of overlapping clones covering 8.5 kb of sequence
(Fig. 3A; GenBank Accession AF006010). Analysis of the nucleotide sequence
(Fig. 3B) revealed an open reading frame of 2799 amino acids (Fig. 3C). The
EDD sequence was divided into overlapping 1800 bp segments and used in
Blastx searches of the GenBank database. The only homology to a human
sequence of known function was to polyA binding protein across 50 amino
acids (50%, Fig. 3C) although the similarities among mammalian polyA
binding proteins in this stretch are usually in the vicinity of 100%.
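
The start of the open reading frame can be checked directly against the first 90 nucleotides of Figure 3B. The following is a minimal sketch using the standard genetic code; the helper names (`CODON_TABLE`, `translate`) are illustrative and not taken from the specification. It locates the first ATG and translates in frame from there.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Nucleotides 1-90 of Figure 3B (line 1 and the first three blocks of line 61).
FIG_3B_START = (
    "CGCCCTCGAGTGGAGGACGAGAAGGAAAGCACCATGACGTCCATCCATTTCGTGGTTCAC"
    "CCGCTGCCGGGCACCGAGGACCAGCTCAAT"
)


def translate(seq: str, start: int) -> str:
    """Translate from a 1-based start position until a stop codon or the end."""
    protein = []
    for i in range(start - 1, len(seq) - 2, 3):
        residue = CODON_TABLE[seq[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# 1-based position of the first ATG in the excerpt.
START = FIG_3B_START.find("ATG") + 1
PEPTIDE = translate(FIG_3B_START, START)

if __name__ == "__main__":
    print(START, PEPTIDE)  # 34 MTSIHFVVHPLPGTEDQLN
```

The first ATG falls at nucleotide 34, the position from which the claims number the coding sequence, and the bases around it read ACCATGA.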

The DNA sequence of EDD showed significant similarity to two
sequences in the database. Both of these genes encode proteins belonging to
the HECT family of ubiquitin-protein ligases, although their specificities are
unknown. HECT proteins contain a conserved domain of approximately 300
amino acids that contains a cysteine residue able to bind ubiquitin via a
thioester linkage. Nucleotides 5667 to 8502 of EDD were 88% identical to the
rat 100 kDa protein cDNA sequence (26), nucleotides 572 to 740 and 3498 to
3867 were 69% identical to two regions of the Drosophila melanogaster
hyperplastic discs gene (hyd) and nucleotides 7560 to 8430 were 60%
identical to a third region of hyd (13). The putative initiation codon is
surrounded by a consensus sequence for strong translational initiation
(ACCATGA, (29)) and corresponds to a possible start codon of the Drosophila
hyd gene (13). The stop codon corresponds to that shared by the rat 100 kDa
protein and hyd genes. Like EDD, both the hyd and rat 100 kDa protein genes
have estimated mRNA transcript sizes of 9.5 kb (14, 26). The predicted EDD
protein is identical to HYD at 40% of amino acid residues and similar at 64%
of residues, while the carboxyl third of EDD is 96% identical and 98.5%
similar to rat 100 kDa protein. The most highly conserved regions between
HYD and EDD are designated by bold type in Figure 3C. Within two of these
regions there are stretches of 40-80 amino acids that are highly conserved
between HYD, EDD and a possible C. elegans homologue of HYD contained
within 2 overlapping cosmids (GenBank Accession No. G1729554 and
G1729549). The longest conserved regions between EDD and HYD are a
central domain of approximately 400 amino acids (58% identity, 72%
similarity) and the carboxyl 300 amino acids which include the HECT
domain and conserved cysteine residue (64% identity, 80% similarity). This
latter region also showed around 30% identity and 50% similarity with other
HECT proteins including yeast RSP5 or PUB-1 and RAD26 (14, 30, 31), and
the mammalian proteins UreB1 (19), Nedd-4 (15, 20, 32, 33) and E6-AP (15,
17, 18). Apart from two putative nuclear localization signals (34), no other
consensus functional domains were identified within the EDD sequence.
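
Percent identity and similarity figures of the kind quoted above are computed over the columns of a pairwise alignment. The sketch below shows one way to do this. The specification does not state which alignment program or similarity scheme was used, so the conservative-substitution groups and the toy alignment are assumptions made purely for illustration.

```python
# Hypothetical conservative-substitution groups, for illustration only;
# the source does not say which similarity scheme was applied.
SIMILAR_GROUPS = ["ILVM", "FYW", "KRH", "DE", "NQ", "ST", "AG"]


def _similar(a: str, b: str) -> bool:
    """True if residues are identical or fall in the same assumed group."""
    return a == b or any(a in group and b in group for group in SIMILAR_GROUPS)


def identity_and_similarity(aligned_a: str, aligned_b: str):
    """Percent identity and similarity over aligned columns without a gap ('-')."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if a != "-" and b != "-"]
    if not columns:
        return 0.0, 0.0
    identical = sum(a == b for a, b in columns)
    similar = sum(_similar(a, b) for a, b in columns)
    return 100.0 * identical / len(columns), 100.0 * similar / len(columns)


# Toy alignment invented for this example; not taken from EDD or HYD.
IDENTITY, SIMILARITY = identity_and_similarity("MKV-LDESTA", "MRVALDQ-TA")

if __name__ == "__main__":
    print(IDENTITY, SIMILARITY)
```

Similarity is always at least as high as identity under this definition, which matches the pattern of the paired values reported for each region.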

Chromosomal localization of the EDD gene

FISH was used to localize the gene for EDD. Eighteen metaphases from
a normal male were examined for fluorescent signal. Seventeen of these
metaphases showed signal on one or both chromatids of chromosome 8 in the
region q22. High resolution studies of 8 metaphases showed signal at q22.3
(Fig. 4). There was a total of 4 non-specific background dots observed in
these 18 metaphases. A similar result was obtained from hybridization of the
probe to 11 metaphases from a second normal male (data not shown). This
localization was consistent with independent assignment of an EST
corresponding to EDD (EST116344) using the radiation hybrid panel
Genebridge 4.

Characterization of EDD protein

A rabbit antiserum (AbPEP1) against an EDD-specific peptide
matching a sequence towards the carboxyl terminus of the protein
(underlined in Figure 3C) reacted strongly on Western blots with a truncated
(100 kDa) recombinant EDD protein expressed in Sf9 cells using a
baculovirus system (Fig. 5A). A second strongly reactive band of
approximately 200 kDa was also seen, but this appeared to be non-specific as
antibody binding was not competed by the EDD peptide. The full length EDD
cDNA was cloned into pBluescript and translated in vitro in a rabbit
reticulocyte lysate system. The size of the major product was in agreement
with the expected molecular mass of the protein as predicted from the amino
acid sequence (~300 kDa, Fig. 5B). The identity of the translated protein was
confirmed by immunoprecipitation from either translation reactions or T-47D
whole cell lysates with AbPEP1 (Fig. 5B). Western blotting of whole cell
lysates from T-47D cells using AbPEP1 detected two major bands, both
abolished in the presence of competing peptide: a major species at
approximately 230 kDa and a minor species of higher molecular mass (Fig.
5C). This latter band corresponds in size to that of the in vitro translated
protein and is immunoprecipitated by AbPEP1 (Fig. 5C) and by two other
EDD-specific peptide antibodies (not shown). However, the 230 kDa protein
is not immunoprecipitated from cell lysates by these antibodies. As a single
EDD mRNA transcript was detected on Northern blots, it was hypothesised
that the EDD protein may be processed to the 230 kDa form, which could be
folded in such a way that it was not susceptible to immunoprecipitation in its
native state. However, transient expression of full length EDD in HEK-293
cells followed by Western blotting of whole cell lysates revealed an increase
in the expression of the 300 kDa species only (not shown). Western blotting
of whole cell lysates from a number of normal breast and breast cancer
epithelial cell lines showed that EDD protein was expressed in all
immortalized and cancer cell lines but not in a normal breast cell line, 184
(Fig. 6).
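
The expected molecular masses quoted above can be approximated from sequence length alone. This is a rough sketch: the average residue mass of 110 Da is a common rule of thumb and an assumption introduced here, not a value from the specification, and a mass computed from the actual amino acid composition would differ somewhat.

```python
# Assumed rule-of-thumb average mass per amino acid residue, in daltons.
AVERAGE_RESIDUE_MASS_DA = 110.0


def approx_mass_kda(n_residues: int,
                    residue_mass_da: float = AVERAGE_RESIDUE_MASS_DA) -> float:
    """Approximate protein mass in kDa from the number of residues."""
    return n_residues * residue_mass_da / 1000.0


# Full length open reading frame: 2799 amino acids.
FULL_LENGTH_KDA = approx_mass_kda(2799)

# Truncated construct: aa 1910 to aa 2799 inclusive (890 residues).
TRUNCATED_KDA = approx_mass_kda(2799 - 1910 + 1)

if __name__ == "__main__":
    print(f"full length ~{FULL_LENGTH_KDA:.0f} kDa")
    print(f"truncated   ~{TRUNCATED_KDA:.0f} kDa")
```

Under this assumption the full length protein comes out near 308 kDa and the truncated construct near 98 kDa, in line with the approximately 300 kDa and 100 kDa species described.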

Identity of the rat gene product

The previously described rat cDNA that is highly homologous to the
EDD gene reportedly gives rise to a 100 kDa protein, inferred from cDNA
sequence data which showed several in-frame stop codons upstream of the
putative initiation codon (26), corresponding to amino acid residue 1910 of
EDD. These stop codons were not present in the EDD cDNA. Furthermore,
although we were able to confirm that an anti-HYD antibody detected an
approximately 100 kDa protein in rat muscle lysates, this species was not
detected by AbPEP1 even though the predicted sequences of human and rat
proteins are identical at every residue of the peptide used to raise the
AbPEP1 antibody. This led the present inventors to question whether the 100
kDa protein was the actual rat gene product.

A segment of rat cDNA containing the stretch of sequence upstream of
the proposed initiation codon was cloned and found to contain an additional
base that, by changing the reading frame, removes the upstream stop codons
(Fig. 7). Correction of this apparent error results in a rat cDNA sequence
that closely matches the human cDNA, in which a continuous open reading frame
exists throughout the sequence. While the rat cDNA sequence corresponding to the
amino terminal two-thirds of EDD has not been cloned, a number of mouse
expressed sequences covering parts of this region are recorded in the
GenBank database (Accession No. AA183561, AA177260, AA183970,
AA231351, AA087561) and these show levels of similarity with the
EDD DNA sequence similar to that seen with the published rat sequence. Thus it
appears that the true product of the rat gene is not a 100 kDa protein but may
exist as a larger species. In rat lysates, however, AbPEP1 does not detect a
protein having a molecular weight consistent with the human (EDD) and
Drosophila (HYD) gene products.

Ubiquitin binding by EDD

A critical feature of the HECT family of E3 enzymes is their ability to
reversibly form thioesters with ubiquitin at a conserved cysteine residue
within the HECT domain. This property has been demonstrated for the HECT
proteins human E6-AP, rat 100 kDa protein and yeast RSP5 where the
thioester linkage remains intact in the absence of reducing agents but is
broken in the presence of 100 mM DTT (14). Substitution of the conserved
cysteine residue prevents ubiquitin thioester bond formation. However, this
property has not been shown for the HYD protein. To assess the potential of
EDD to function as an E3 we tested whether EDD could form a reversible
bond with ubiquitin via the conserved cysteine, C2768. [35S]-labelled in vitro
translated truncated protein (~100 kDa of carboxyl terminus sequence) was
incubated with purified GST-ubiquitin fusion protein in the presence or
absence of DTT before SDS-PAGE (Fig. 8A).

In the absence of DTT an additional higher molecular mass protein
band was observed that corresponded to the expected size of an
EDD-GST-ubiquitin conjugate (~130 kDa, upper arrow in Fig. 8A). This species was
abolished in the presence of 100 mM DTT, suggesting involvement of a
thioester bond in its formation. This was confirmed by experiments with an
in vitro translated protein containing a C2768A mutation: binding of
GST-ubiquitin was not seen under these conditions (Fig. 8A). A species of slightly
higher molecular mass than EDD was also observed (lower arrow in Fig. 8A),
consistent with the formation of ubiquitin-EDD conjugates, ubiquitin being
present as a component of the rabbit reticulocyte lysate. Again this was not
observed using the mutant protein or in the presence of 100 mM DTT. Similar
results were achieved with full length EDD protein obtained (though at lower
yield) by in vitro translation (Fig. 8B).
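
The band sizes interpreted above follow from simple addition of the component masses. A minimal sketch, assuming typical values of about 26 kDa for the GST moiety and about 8.6 kDa for ubiquitin (neither figure is stated in the specification) and the approximately 100 kDa truncated EDD protein described in the text:

```python
TRUNCATED_EDD_KDA = 100.0  # approximate size given in the text
GST_KDA = 26.0             # assumed typical mass of the GST moiety
UBIQUITIN_KDA = 8.6        # assumed mass of 76-residue ubiquitin


def conjugate_mass_kda(protein_kda: float, *adducts_kda: float) -> float:
    """Expected mass of a protein carrying the given covalently bound adducts."""
    return protein_kda + sum(adducts_kda)


# Thioester conjugate with the added GST-ubiquitin fusion (upper arrow).
GST_UB_CONJUGATE_KDA = conjugate_mass_kda(TRUNCATED_EDD_KDA, GST_KDA, UBIQUITIN_KDA)

# Conjugate with free ubiquitin from the reticulocyte lysate (lower arrow).
UB_CONJUGATE_KDA = conjugate_mass_kda(TRUNCATED_EDD_KDA, UBIQUITIN_KDA)

if __name__ == "__main__":
    print(f"EDD-GST-ubiquitin ~{GST_UB_CONJUGATE_KDA:.0f} kDa")
    print(f"EDD-ubiquitin     ~{UB_CONJUGATE_KDA:.0f} kDa")
```

With these assumed values the GST-ubiquitin conjugate is expected in the region of 130 to 135 kDa, and the ubiquitin-only conjugate only slightly above the unmodified protein, consistent with the two species indicated in Fig. 8A.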

DISCUSSION

Application of the differential display PCR technique to a cultured
human breast cancer cell model in which clearly defined proliferative
responses to progestins are observed has led to the identification of a novel
gene, EDD, a likely human homologue of the Drosophila melanogaster
tumor suppressor gene hyperplastic discs (13). EDD is also highly
homologous to the partial published sequence for the cDNA encoding the
rat 100 kDa protein (26). All three genes produce large (approx 9.5 kb)
mRNAs and the predicted entire EDD open reading frame of 2799 amino
acids shares 40% identity with that of Drosophila hyd while the
carboxyl-terminal 889 amino acids of EDD share 96% identity with the rat protein.
Western analysis showed that the EDD gene product is a protein of
approximately 300 kDa. This protein is also immunoprecipitated by 3
different peptide-specific EDD antibodies and also corresponds to the size
of the major in vitro translated gene product. The large discrepancy in the
predicted size of the human and rat proteins was apparently resolved by
re-examination of the rat cDNA sequence which disclosed an error in the
published translation start site, pointing to the likelihood that a larger gene
product exists.

At their carboxyl termini EDD, its rat homologue and HYD all contain
a highly homologous HECT domain, indicating membership of a larger
family of proteins which function as ubiquitin-protein ligases (E3s). The
ubiquitination of target proteins occurs by the action of multiple interacting
proteins: a ubiquitin-activating enzyme (E1), ubiquitin-conjugating enzymes
(E2) and ubiquitin-protein ligases (E3). Substrate specificity is largely
determined by E3s, which bind and transfer ubiquitin to the target protein
following interaction with specific E2s. The key feature of the HECT class
of E3s is their ability to covalently bind ubiquitin through a conserved
cysteine residue located in their HECT domains (14). This property was
demonstrated for EDD using in vitro translated protein that lost the ability
to bind ubiquitin if the conserved cysteine (C2768) was substituted and it
was therefore concluded that EDD is an E3.

Few E3 genes have been cloned (only two from human) but others are
likely to exist as ubiquitin-dependent proteolysis is involved in many
cellular processes and targets many known proteins. Ubiquitin-mediated
proteolysis is critical in the control of cell cycle progression, being
responsible for the periodic destruction of key cell cycle regulators
including cyclins (35-37) and cyclin-dependent kinase inhibitors (38, 39)
and also targeting transcription factors (40-43), the tumor suppressor
protein p53 (18) and cell-cell signalling components such as β-catenin (44).
Disruption of the murine Itch locus, which encodes an E3, caused
hyperplasia in lymphoid and gastrointestinal epithelial tissues and an
abnormal inflammatory response (21) while mutations in E6-AP in humans
result in neurological abnormalities, indicating critical, and perhaps tissue
specific, roles for E3 proteins (45).

Although substrates for EDD and its rat and Drosophila homologs
have yet to be defined, conservation between the central domain of EDD
and that of HYD suggests that this region has an important function,
perhaps in substrate recognition. For the yeast E3 Rsp5, substrate specificity
is determined by the amino terminal domain and does not require the HECT
domain (16). Alternatively, this region could be involved in the binding of
as yet unknown E2 proteins that interact specifically with EDD. The mouse
E3 Nedd4 has at least two distinct E2 binding domains, only one of which is
within the HECT domain (15) while human E6-AP requires only the HECT
domain for E2 recognition (46). As the protein produced from the truncated
EDD construct still binds ubiquitin reversibly, at least some E2 recognition
function is present in this carboxyl domain. Other possible functions of the
conserved central domain include cellular localization or translocation
between cytoplasm and nucleus, cofactor association or phosphorylation.

Although ubiquitination is clearly involved in steroid-responsive
processes such as regulation of cell cycle progression, specific regulation of
ubiquitin pathways by steroid hormones has not previously been reported.
The precise role of EDD in progestin action is unknown, particularly whether
it participates in those key early events that occur in response to this
hormone and which are ultimately responsible for its effects on cell
proliferation and differentiation. Progestin regulation of EDD mRNA, which
requires de novo protein synthesis, is transient with maximal levels 3 to
4-fold above control observed at 6 h. This increase in EDD expression levels
therefore precedes the increase in the S phase fraction of T-47D cells
following ORG 2058 treatment under the same conditions, which typically
occurs at 12 to 14 h (3) and hence is consistent with a possible role in control
of cell cycle progression. Similar levels of EDD induction were observed in
antiestrogen-arrested MCF-7 breast cancer cells treated with 17β-estradiol
(not shown), suggesting this may be a generalized response to mitogens.

However, given that EDD is also expressed in non-progestin target
tissues, a more widespread role than specifically mediating progestin effects
is expected. Information on the biological role of HYD from mutagenesis
studies in Drosophila (13) may ultimately give clues as to the function of
EDD. The null hyd phenotype is lethal, as are severe mutations in the pupal
or larval stages. Less severe mutations result in overgrowth (hyperplasia) of
larval imaginal discs (the larval centres of cell proliferation that give rise to
adult structures such as wings, legs and antennae), apparently caused by a
failure to terminate cell proliferation when the discs reach their
characteristic size, hence the definition of hyd as a tumor suppressor gene.
Surviving adults are sterile due to germ cell defects, and interestingly, high
expression levels of EDD and rat 100 kDa protein mRNA are seen in human
and rat testes, suggesting a critical function in this organ.

Studies of a number of human homologues of Drosophila tumor
suppressor genes strongly suggest that these genes have similar roles in both
species in controlling cell proliferation, and that such genes can be important
in human heritable and sporadic cancers, for example patched (47),
mutations of which are linked to basal cell carcinoma, and discs large (45,
48), a target of the APC gene which is mutated in sporadic colorectal tumors
and familial adenomatous polyposis coli. The possible involvement of EDD
in human tumorigenesis and tumor progression is therefore of particular
interest. The EDD gene locus at chromosome 8q22 is often disrupted in a
variety of cancers, being deleted in adenocarcinoma of the ovary and lung
(49, 50), hepatocellular carcinoma (51) and head and neck squamous cell
carcinoma (52), amplified in many tumor types including gastrointestinal and
primary breast cancers (53, 54) and involved in translocations in acute
myeloid leukemia (55). Chromosome 8q22 is also a region affected in the
human developmental disorder Klippel-Feil syndrome (56).

The abbreviations used in this specification are: DD-PCR, differential
display polymerase chain reaction; DTT, dithiothreitol; EDD, E3 isolated by
differential display; FISH, fluorescence in situ hybridization; GST,
glutathione S-transferase; HECT, homologous to E6-AP carboxyl terminus;
PAGE, polyacrylamide gel electrophoresis; PMSF, phenylmethylsulfonyl
fluoride; PR, progesterone receptor; RACE, rapid amplification of cDNA ends.

It will be appreciated by persons skilled in the art that numerous
variations and/or modifications may be made to the invention as shown in
the specific embodiments without departing from the spirit or scope of the
invention as broadly described. The present embodiments are, therefore, to
be considered in all respects as illustrative and not restrictive.

The Claims:

1. An isolated polynucleotide molecule comprising a nucleotide sequence 
encoding a protein which comprises the following N-terminal amino 
acid sequence: 

MTSIHFVVHP
or a biologically active portion of said protein. 

2. A polynucleotide molecule according to claim 1, wherein the encoded 
protein comprises the following N-terminal amino acid sequence: 
MTSIHFVVHPLPGTEDQ

Q. 

3. A polynucleotide molecule according to claim 1 or 2, wherein the 
encoded protein is a ubiquitin-protein ligase and has an approximate 
molecular weight of 300kDa. 

4. A polynucleotide molecule according to any one of claims 1 to 3, 
comprising a nucleotide sequence > 90% homologous to the nucleotide 
sequence shown at Figure 3B from nucleotide 34 to nucleotide 8424 or 
a portion(s) thereof. 

5. A polynucleotide molecule according to any one of claims 1 to 3, 
comprising a nucleotide sequence > 95% homologous to the nucleotide 
sequence shown at Figure 3B from nucleotide 34 to nucleotide 8424 or 
a portion(s) thereof. 

6. A polynucleotide molecule according to any one of claims 1 to 3, 
comprising a nucleotide sequence substantially corresponding to the 
nucleotide sequence shown at Figure 3B from nucleotide 34 to 
nucleotide 8424 or a portion(s) thereof. 

7. An oligonucleotide or polynucleotide probe molecule labelled with a
suitably detectable label, said probe molecule comprising a nucleotide
sequence substantially corresponding to, or complementary to, a > 8
nucleotide portion of the nucleotide sequence shown at Figure 3B from
nucleotide 34 to nucleotide 8424.

8. An expression vector or cassette, said vector or cassette comprising a 
polynucleotide molecule according to any one of claims 1 to 6 operably 
linked to a promoter sequence. 

9. A non-human organism, said organism stably transformed with a 
polynucleotide molecule according to any one of claims 1 to 6. 

10. A non-human organism, said organism stably transformed with an
expression vector or cassette according to claim 8.

11. A protein comprising the following N-terminal amino acid sequence: 

MTSIHFVVHP

or a biologically active portion of said protein, said protein or 
biologically active portion thereof being in a substantially pure form. 

12. A protein according to claim 11, wherein said protein comprises the 
following N-terminal amino acid sequence: 
MTSIHFVVHPLPGTEDQ

Q. 

13. A protein according to claim 11 or 12, wherein said protein is a 
ubiquitin-protein ligase and has an approximate molecular weight of 
300kDa. 

14. A protein according to any one of claims 11 to 13, wherein the protein 
comprises an amino acid sequence substantially corresponding to the 
amino acid sequence shown in Figure 3C. 

15. An antibody or fragment thereof which specifically binds to a protein 
according to any one of claims 11 to 14 or an antigenic portion thereof. 

16. A protein or antigenic portion thereof capable of binding to an
anti-pEDD antibody.

17. An assay for assessing progestin-responsiveness in a subject, said
method comprising the steps of:

(i) isolating cells or tissue from said subject; and

(ii) detecting the presence of a protein comprising an amino acid
sequence substantially corresponding to that shown at Figure 3C.

18. An assay according to claim 17, wherein before step (ii) the isolated 
cells or tissue is exposed to progestin or a progestin agonist or 
antagonist. 

19. An assay according to claim 16 or 17, wherein said step (ii) is 
conducted using an antibody or fragment thereof according to claim 
15. 

20. A method for the diagnosis or determination of a predisposition to 
hyperproliferative disease, said method comprising detecting in a 
subject a polymorphism or alteration in a gene comprising a nucleotide 
sequence substantially corresponding to the nucleotide sequence 
shown in Figure 3B from nucleotide 34 to nucleotide 8424, said 
polymorphism or alteration being indicative of said hyperproliferative 
disease or a predisposition to said hyperproliferative disease. 

21. A method according to claim 20, wherein said hyperproliferative 
disease is a cancer. 

22. A method according to claim 21, wherein said cancer is breast cancer.

FIGURE 3B

1 CGCCCTCGAG TGGAGGACGA GAAGGAAAGC ACCATGACGT CCATCCATTT CGTGGTTCAC 
61 CCGCTGCCGG GCACCGAGGA CCAGCTCAAT GACAGGTTAC GAGAAGTTTC TGAGAAGCTG 
121 AACAAATATA ATTTAAACAG CCACCCCCCT TTGAATGTAT TCCAACAGGC TACTATTAAA 
Ia} ^l^l GG TGGGACCAAA TCATGCTGCC TTTCTTCTTG AGGATGGTAG AGTTTGCAGG 
" *Zl GG lZZT? CAGTACAGCC AGACAGATTG GAATTGGGTA AACCTGATAA TAATGATGGG 
I 0 * Z^tll^: ACAGCAACTC GGGGGCAGGG AGGACGTCAA GGCCTGGTAG GACAAGCGAC 
361 TCTCCATGGT TTCTCTCAGG TTCTGAGACT CTAGGCAGGC TGGCAGGCAA CACCTTAGGA 
421 AGCCGCTGGA GTTCTGGAGT GGGTGGAAGT GGTGGAGGAT CCTCTGGTAG GTCATCAGCT 
GG £ GC Z GG ^ G ATTCCCGCCG GCAGACTCGA GT TATTCGGA CAGGACGGGA TCGAGGGTCT 
541 GGGCTTTTGG GCAGTCAGCC CCAGCCAGTT ATTCCAGCAT CTGTCATTCC AGAGGAGCTG 
601 ATTTCACAGG CCCAAGTTGT TTTACAAGGC AAATCCAGAA GTGTCATTAT TCGAGAACTT 
661 CAGAGAACAA ATCTTGATGT GAACCTTGCT GTAAATAATT TACTTAGCCG GGATGATGAA 
721 GATGGAGATG ATGGGGATGA TACAGCCAGC GAATCTTATT TGGCTGGAGA GGATCTTATG 
1*} *! GCCGACAT TCATTCTGCC CACCCAAGTG TCATTATTGA TGCAGATGCC 

841 ATGTTTTCTG AAGACATTAG CTATTTTGGT TACCCTTCTT TTCGTCGTTC ATCACTTTCC 
901 AGGCTAGGCT CATCTCGAGT TCTCCTTCTT CCCTTAGAGA GAGACTCTGA GCTGTTGCGT 
961 GAACGCGAAT CCGTTTTACG TTTACGTGAA CGAAGGTGGC TTGATGGAGC CTCATTTGAT 
1021 AATGAAAGGG GTT CTACCAA GCAAGGAAGG AGAGCCAAAC TTGATAAGAA GAATACACCT 
all™™ CAGTATCTCT AGGAGAAGAT TT GCAGTGGT GGCCTGATAA GGAtSSaS 
1141 AAATTCATCT GTATGGCTCT GTATTCTGAA CTTCTGGCTG TCAGCAGTAA AGGAGAACTT 
1201 TATCAGTGGA AATGGAGTGA ATCTGAGCCT T ACAGAAAT G CCCAGAATCC TTCATTACAT 
1261 CATCCACGAG CAACATTTTT GGGGTTAACC AATGAAAAGA TAGTCCTCCT GTCTGCAAAT 
1321 AGCATAAGAG CAACTGTAGC TACAGAAAAG AACAAGGTTG CTACATGGGT GGATGAAACT 
1381 TTAAGTTCTG TGGCTTCTAA ATTAGAGCAC ACTGCTCAGA CTTACTCTGA ACTTCAAGGA 
1441 GAGCGGATAG TTT CTTTACA TTGCTGTGCC CTTTACACCT GCGCTCAGCT GGAAAACAGT 
■ ™S GGGGTGTAGT TCCTTTTAGT CAAAGGAAGA AAATGTTAGA GAAAGcSgA 
1561 GCAAAAAATA AAAAGCCTAA ATCCAGTGCT GGTATTTCTT CAATGCCGAA CATCACTGTT 
1621 GGTACCCAGG TATGCTTGAG AAATAATCCT CTTTATCATG CTGGAGCAGT TGCATTTTCA 
1681 ATTAGTGCTG GGATTCCTAA AGTTGGTGTC TTAATGGAGT CAGTTTGGAA TATGAATGAC 
llo\ Krvl^^l TTCAACTTAG ATCTCCTGAA AGCTTGAAAA ACATGGAAAA AGCTAGCAAA 
J2S ™^^ G CTAAGCCTGA AAGTAAGCAG GAGCCAGTGA AAACAGAAAT GGGTCCTCCA 
1861 CCATCTCCAG CATCCACGTG TAGTGATGCA TCCTCAATTG CCAGCAGTGC ATCAATGCCA 
1921 TACAAACGAC GACGGTCAAC CCCTGCACCA AAAGAAGAGG AAAAGGTGAA JgSgAGCaS 
1981 TGGTCTCTTC GGGAAGTGGT TTTTGTGGAA GATGTCAAGA ATGTTCCTGT TGGCAAGGTG 
2041 CTAAAAGTAG ATGGTGCCTA TGTTGCTGTA AAATTTCCAG GAACCTCCAG TAATACTAAC 
2101 TGTCAGAACA GCTCTGGTCC AGATGCTGAC CCTTCTTCTC TCCTGCAGGA T^TAgStA 
2161 CTTAGAATTG ATGAATTGCA GGTTGTCAAA ACTGGTGGAA CACCGAAGGT TCCCGACTGT 
2221 TTCCAAAGGA CTCCTAAAAA GCTTTGTATA CCTGAAAAAA CAGAAATATT AGCAGTGAAT 
2281 GTAGATTCCA AAGGTGTTCA TGCTGTTCTG AAGACTGGAA ATTGGGTGCG ATACTGTATC 
2341 TTTGATCTTG CTACAGGAAA AGCAGAACAG GAAAATAATT TTCCTACAAG CAGCATTGCT 
2401 ^ CC ZI GG ! C AGAATGAGAG GAATGTAGCC ATTTTCACTG CTGGACAGGA ATCTCCCATT
2461 ATTCTTCGAG ATGGAAATGG TACCATCTAC CCAATGGCCA AAGATTGCAT GGGAGGAATA
2521 AGGGATCCCG ATTGGCTGGA TCTTCCACCT ATTAGTAGTC TTGGAATGGG TGTGCATTCT
2581 "AATAAATC TTCCTGCCAA TTCAACAATC AAAAAGAAAG CTGCTGTTAT CATCATGGCT
2641 GTAGAGAAAC AAACCTTAAT GCAACACATT CTGCGCTGTG ACTATGAGGC CTGTCGACAA 
2701 TATCTAATGA ATCTTGAGCA ACGGTTTTTA GAGCAGAATC TACAGATGCT GCAGaSSS 
2761 ATCAGCCACA GATGTGATGG AAATCGAAAT ATTTTGCATG CTTGtSJaS aSSSSS 
2821 £SS5^ TAAAGAAGAA GAGGAAGCGG AGCGTTcJS XgAAAtIcA

2881 JZ^f^ GGCTTTCTGC TGTTGAGGCC ATTGCAAATG CAATATCAGT TGTTTCAAGT
2941 AATGGCCCAG GTAATCGGGC TGGATCATCA AGTAGCCGAA GTTTGAGATT ACGGGAAATG 
3001 ATGAGACGTT CGTTGAGAGC AGCTGGTTTG GGTAGACATG AAGCTGGAGC TTCATCCAG? 
3061 GACCACCAGG ATCCAGTTTC ACCCCCCATA GCTCCCCCTA GTTGGGTTCC TGACCcSS 
3121 GCGATGGATC CTGATGGTGA CATTGATTTT ATCCTGGCCC CCGCTGTGGG AT^StaS 
3181 ACAGCAGCAA CCGGTACTGG TCAAGGACCA AGCACCTCCA CTATTCCAGG TcSSScA 
3241 GAGCCATCTG TAGTAGAATC CAAGGATCGA AAGGCGAATG CTCATTTTAT aSgAAAtS
3301 TTATGTGACA GTGTGGTTCT CCAGCCCTAT CTACGAGAAC TTCTTTCTGC CAAGGATGCA
3361 AGAGGGATGA CCCCATTTAT GTCAGCTGTA AGTGGCCGAG CTTATCCTGC ScAATtS 

3421 ATCTTAGAAA CTGCTCAGAA AATTGCAAAA GCTGAAATAT cc?caaSS aa^SgaggS
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FIGURE 3B continued 
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3481 GATGTATTCA TGGGAATGGT TTGCCCATCA GGTACCAACC CTGATGACTC TCCTTTATAT
3541 GTTTTATGTT GTAATGACAC TTGCAGTTTT ACATGGACTG GAGCAGAGCA CATTAACCAG
3601 GATATTTTTG AGTGTCGAAC TTGTGGCTTG CTGGAGTCAC TGTGTTGTTG TACGGAATGT
3661 GCAGGGGTTT GTCATAAAGG TCATGATTGG AAACTCAAAC GGACATCACC AACAGCCTAC
3721 TGTGACTGTT GGGAGAAATG TAAATGTAAA ACTCTTATTG CTGGACAGAA ATCTGCTCGT
3781 CTTGATCTAC TTTATCGCCT GCTCACTGCT ACTAATCTGG TTACTCTGCC AAACAGCAGG
3841 GGAGAGCACC TCTTACTATT CTTAGTACAG ACAGTCGCAA GGCAGACGGT GGAGCATTGT
3901 CAATACAGGC CACCTCGAAT CAGGGAAGAT CGTAACCGAA AAACAGCCAG TCCTGAAGAT
3961 TCAGATATGC CAGATCATGA TTTAGAGCCT CCAAGATTTG CCCAGCTTGC ATTGGAGCGT
4021 GTTCTACAGG ACTGGAATGC CTTGAAATCT ATGATTATGT TTGGGTCGCA GGAGAATAAA
4081 GACCCTCTTA GTGCCAGCAG TAGAATAGGC CATCTTTTGC CAGAAGAGCA AGTATACCTC
4141 AATCAGCAAA GTGGCACAAT TCGGCTGGAC TGTTTCACTC ATTGCCTTAT AGTTAAGTGT
4201 ACAGCAGATA TTTTGCTTTT AGATACTCTA CTAGGTACAC TAGTGAAAGA ACTCCAAAAC
4261 AAATATACAC CTGGACGTAG AGAAGAAGCT ATTGCTGTGA CAATGAGGTT TCTACGTTCA
4321 GTGGCAAGAG TTTTTGTTAT TCTGAGTGTG GAAATGGCTT CATCCAAAAA GAAAAACAAC
4381 TTTATTCCAC AGCCAATTGG AAAATGCAAG CGTGTATTCC AAGCATTGCT ACCTTACGCT
4441 GTGGAAGAAT TGTGCAACGT AGCAGAGTCA CTGATTGTTC CTGTCAGAAT GGGGATTGCT
4501 CGTCCAACTG CACCATTTAC CCTGGCTAGT ACTAGCATAG ATGCCATGCA GGGCAGTGAA
4561 GAATTATTTT CAGTGGAACC ACTGCCACCA CGACCATCAT CTGATCAGTC TAGCAGCTCC
4621 AGTCAGTCTC AGTCATCCTA CATCATCAGG AATCCACAGC AGAGGCGCAT CAGCCAGTCA
4681 CAGCCCGTTC GGGGCAGAGA TGAAGAACAG GATGATATTG TTTCAGCAGA TGTGGAAGAG
4741 GTTGAGGTGG TGGAGGGTGT GGCTGGAGAA GAGGATCATC ATGATGAACA GGAAGAACAC
4801 GGGGAAGAAA ATGCTGAGGC AGAGGGACAA CATGATGAGC ATGATGAAGA CGGGAGTGAT
4861 ATGGAGCTGG ACTTGTTAGC AGGAGCAGAA ACAGAAAGTG ATAGTGAAAG TAACCACAGC
4921 AACCAAGATA ATGCTAGTGG GCGCAGAAGC GTTGTCACTG CAGCAACTGC TGGTTCAGAA
4981 GCAGGAGCAA GCAGTGTTCC TGCCTTCTTT TCTGAAGATG ATTCTCAATC GAATGACTCA
5041 AGTGATTCTG ATAGCAGTAG TAGTCAGAGT GACGACATAG AACAGGAGAC CTTTATGCTT
5101 GATGAGCCAT TAGAAAGAAC CACAAATAGC TCCCATGCCA ATGGTGCTGC CCAAGCTCCC
5161 CGTTCAATGC AGTGGGCTGT CCGCAACACC CTGCATCAGC GAGCAGCCAG TACAGCCCCT
5221 TCCAGTACAT CTACACCAGC AGCAAGTTCA GCGGGTTTGA TTTATATTGA TCCTTCAAAC
5281 TTACGCCGGA GTGGTACCAT CAGTACAAGT GCTGCAGCTG CAGCAGCTGC TTTGGAAGCT
5341 AGCAACGCCA GCAGTTACCT AACATCTGCA AGCAGTTTAG CCAGGGCTTA CAGCATGTCA
5401 TTAGACAAAT CATCGGACTT GATGGGCCTT ATTCCTAAGT ATAATCACCT AGTATACTCT
5461 CAGATTCCAG CAGCTGTGAA ATTGACTTAC CAAGATGCAG TAAACTTACA GAACTATGTA
5521 GAAGAAAAGC TTATTCCCAC TTGGAACTGG ATGGTCAGTA TTATGGATTC TACTGAAGCT
5581 CAATTACGTT ATGGTTCTGC ATTAGCATCT GCTGGTGATC CTGGACATCC AAATCATCCT
5641 CTTCACGCTT CTCAGAATTC AGCGAGAAGA GAGAGGATGA CTGCGCGAGA AGAAGCTAGC
5701 TTACGAACAC TTGAAGGCAG ACGACGTGCC ACCTTGCTTA GCGCCCGTCA AGGAATGATG
5761 TCTGCACGAG GAGACTTCCT AAATTATGCT CTGTCTCTAA TGCGGTCTCA TAATGATGAG
5821 CATTCTGATG TTCTTCCAGT TTTGGATGTT TGCTCATTGA AGCATGTGGC ATATGTTTTT
5881 CAAGCACTTA TATACTGGAT TAAGGCAATG AATCAGCAGA CAACATTGGA TACACCTCAA
5941 CTAGAACGCA AAAGGACGCG AGAACTCTTG GAACTGGGTA TTGATAATGA AGATTCAGAA
6001 CATGAAAATG ATGATGACAC CAATCAAAGT GCTACTTTGA ATGATAAGGA TGATGACTCT
6061 CTTCCTGCAG AAACTGGCCA AAACCATCCA TTTTTCCGAC GTTCAGACTC CATGACATTC
6121 CTTGGGTGTA TACCCCCAAA TCCATTTGAA GTGCCTCTGG CTGAAGCCAT CCCCTTGGCT
6181 GATCAGCCAC ATCTGTTGCA GCCAAATGCT AGAAAGGAGG ATCTTTTTGG CCGTCCAAGT
6241 CAGGGTCTTT ATTCTTCATC TGCCAGTAGT GGGAAATGTT TAATGGAGGT TACAGTGGAT
6301 AGAAACTGCC TAGAGGTTCT TCCAACAAAA ATGTCTTATG CTGCCAATCT GAAAAATGTA
6361 ATGAACATGC AAAACCGGCA AAAAAAGAAG GGGAAGGAAC AGCCCGTGCT GCCAGAAGAA
6421 ACTGAGAGTT CAAAACCAGG GCCATCTGCT CATGATCTTG CTGCACAATT AAAAAGTAGC
6481 TTACTAGCAG AAATAGGACT TACTGAAAGT GAAGGGCCAC CTCTCACATC TTTCAGGCCA
6541 CAGTGTAGCT TTATGGGAAT GGTTCTTTCC CATGATATGC TGCTAGGACG TTGGCGCCTT
6601 TCTTTAGAAC TGTTCGGCAG GGTATTCATG GAAGATGTTG GAGCAGAACC TGGATCAATC
6661 CTAACTGAAT TGGGTGGTTT TGAGGTAAAA GAATCGAAAT TCCGCAGAGA AATGGAAAAA
6721 CTGAGAAACC AGCAGTCAAG AGATTTGTCA CTAGAGGTTG ATCGGGATCG AGATCTTCTC
6781 ATTCAGCAGA CTATGAGGCA GCTTAACAAT CACTTTGGTC GAAGATGTGC TACTATACCA
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FIGURE 3B continued



6841 ATGGCTGTAC ACAGAGTAAA AGTCACATTT AAGGATGAGC CAGGAGAGGG CAGTGGTGTA 
6901 GCACGAAGTT TTTATACAGC CATTGCACAA GCATTTTTAT CAAATGAAAA ATTGCCAAAT 
6961 CTAGAGTGTA TCCAAAATGC CAACAAAGGC ACCCACACAA GTTTAATGCA GAGATTAAGG
7021 AACCGAGGAG AGAGAGACCG GGAAAGGGAG AGAGAAAGGG AAATGAGGAG GAGTAGTGGT
7081 TTGCGAGCAG GTTCTCGGAG GGACCGGGAT AGAGACTTTA GAAGACAGCT TTCCATCGAC 
7141 ACTAGGCCCT TTAGACCAGC CTCTGAAGGG AATCCTAGCG ATGATCCTGA GCCTTTGCCA 
7201 GCACATCGGC AGGCACTTGG AGAGAGGCTT TATCCTCGTG TACAAGCAAT GCAACCAGCA 
7261 TTTGCAAGTA AAATCACTGG CATGTTGTTG GATTATCCCA GCTCAGCTGC TTCTCTTCTA 
7321 GCAAGTGAGG ATTCTCTGAG AGCAAGAGTG GATGAGGCCA TGGAACTCAT TATTGCACAT 
7381 GGACGGGAAA ATGGAGCTGA TAGTATCCTG GATCTTGGAT TAGTAGACTC CTCAGAAAAG 
7441 GTACAGCAGG AAAACCGAAA GCGCCATGGC TCTAGTCGAA GTGTAGTAGA TATGGATTTA 
7501 GATGATACAG ATGATGGTGA TGACAATGCC CCTTTGTTTT ACCAACCTGG GAAA^GAGGA 
7561 TTTTATACTC CAAGGCCTGG CAAGAACACA GAAGCAAGGT TGAATTGTTT CAGAAACATT 
7621 GGCAGGATTC TTGGACTATG TCTGTTACAG AATGAACTCT GTCCTATCAC ATTGAATAGA 
7681 CATGTAATTA AAGTATTGCT TGGTAGAAAA GTCAATTGGC ATGATTTTGC TTTTTTTGAT 
7741 CCTGTAATGT ATGAGAGTTT GCGGCAACTA ATCCTCGCGT CTCAGAGTTC AGATGCTGAT
7801 GCTGTTTTCT CAGCAATGGA TTTGGCATTT GCAATTGACC TGTGTAAAGA AGAAGGTGGA
7861 GGACAGGTTG AACTCATTCC TAATGGTGTA AAGAGACCAG TCACTCCACA GAATGTATAT
7921 GAGTATGTGC GGAAAGACGC AGAACACAGA ATGTTGGTAG TTGCAGAACA GCCCTTACAT 
7981 GCAATGAGGA AAGGTCTACT AGATGTGCTT CCAAAAAATT CATTAGAAGA TTTAACGGCA 
8041 GAAGATTTTA GGCTTTTGGT AAATGGCTGC GGTGAAGTCA ATGTGCAAAT GCTGATCAGT 
8101 TTTACCTCTT TCAATGATGA ATCAGGAGAA AATGCTGAGA AGCTTCTGCA GTTCAAGCGT
8161 TGGTTCTGGT CAATAGTAGA GAAGATGAGC ATGACAGAAC GACAAGATCT TGTTTACTTT
8221 TGGACATCAA GCCCATCACT GCCAGCCAGT GAAGAAGGAT TCCAGCCTAT GCCCTCAATC 
8281 ACAATAAGAC CACCAGATGA CCAACATCTT CCTACTGCAA ATACTTGCAT TTCTCGACTT
8341 TACGTCCCAC TCTATTCCTC TAAACAGATT CTCAAACAGA AATTGTTACT CGCCATTAAG 
8401 ACCAAGAATT TTGGTTTTGT GTAGAGTATA AAAAGTGTGT ATTGCTGTGT AATATTACTA 
8461 GCAAATTTTG TAGATTTTTT TCCATTTGTC TAT
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FIGURE 3C 

1 MTSIHFWHP LPGTEDQLND RLREVSEKLN KYNLNSHPPL NVLEQATIKQ CWGPNHAAF 
61 LLEDGRVCRI GFSVQPDRLE LGKPDNNDGS KLNSNSGAGR TSRPGRTSDS PWFLSGSETL 
121 GRLAGNTLGS RWSSGVGGSG GGSSGRSSAG ARDSRRQTRV IRTGRDRGSG LLGSQPQPVT 
181 PASVIPEELI SQAQVVLQGK SRSVIIRELQ RTNLDVNLAV NNLLSRDDED GDDGDDTASE
241 SYLAGEDLMS LLDADIHSAH PSVIIDADAM FSEDISYFGY PSFRRSSLSR LGSSRVLLLP 
301 LERDSELLRE RESVLRLRER RWLDGASFDN ERGSTSKEGE PNLDKKNTPV QSPVSLGEDL 
361 QWWPDKDGTK FICIGALYSE LLAVSSKGEL YQWKWSESEP YRNAQNPSLH HPRATFLGLT 
421 NEKIVLLSAN SIRATVATEN NKVATWVDET LSSVASKLEH TAQTYSELQG ERIVSLHCCA
481 LYTCAQLENS LYWWGVVPFS QRKKMLEKAR AKNKKPKSSA GISSMPNITV GTQVCLRNNP
541 LYHAGAVAFS ISAGIPKVGV LMESVWNMND SCRFQLRSPE SLKNMEKASK TTEAKPESKQ 
601 EPVKTEMGPP PSPASTCSDA SSIASSASMP YKRRRSTPAP KEEEKVNEEQ WSLREVVFVE
661 DVKNVPVGKV LKVDGAYVAV KFPGTSSNTN CQNSSGPDAD PSSLLQDCRL LRIDELQVVK
721 TGGTPKVPDC FQRTPKKLCI PEKTEILAVN VDSKGVHAVL KTGNWVRYCI FDLATGKAEQ 
781 ENNFPTSSIA FLGQNERNVA IFTAGQESPI ILRDGNGTIY PMAKDCMGGI RDPDWLDLPP 
841 ISSLGMGVHS LINLPANSTI KKKAAVTIMA VEKQTLMQHI LRCDYEACRQ YLMNLEQAW
901 LEQNLQMLQT FISHRCDGNR NILHACVSVC FPTSNKETKE EEEAERSERN TFAERLSAVE 
961 AIANAISVVS SNGPGNRAGS SSSRSLRLRE MMRRSLRAAG LGRHEAGASS SDHQDPVSPP
1021 IAPPSWVPDP PAMDPDGDID FILAPAVGSL TTAATGTGQG PSTSTIPGPS TEPSVVESKD
1081 RKANAHFILK LLCDSVVLQP YLRELLSAKD ARGMTPFMSA VSGRAYPAAI TILETAQKIA
1141 KAEISSSEKE EDVFMGMVCP SGTNPDPSPL YVLCCNDTCS FTWTGAEHIN QDIFECRTCG
1201 LLESLCCCTE CARVCHKGHD QKLKRTSPTA YCDCWEKCKC KTLIAGQKSA RLDLLYRLLT
1261 ATNLVTLPNS RGEHLLLFLV QTVARQTVEH CQYRPPRIRE DRNRKTASPE DSDMPDHDLE
1321 PPRFAQIALE RVLQDWNALK SMIMFGSQEN KDPLSASSRI GHLLPEEQVY LNQQSGTIRL 
1381 DCFTHCLIVK CTADILLLDT LLGTLVKELQ NKYTPGRREE AIAVTMRFLR SVARVFVILS
1441 VEMASSKKKN NFIPQPIGKC KRVFQALLPY AVEELCNVAE SLIVPVRMGI ARPTAPFTLA
1501 STSIDAMQGS EELFSVEPLP PRPSSDQSSS SSQSQSSYII RNPQQRRISQ SQPVRGRDEE 
1561 QDDIVSADVE EVEVVEGVAG EEDHHDEQEE HGEENAEAEG QHDEHDEDGS DMELDLLAAA
1621 ETESDSESNH SNQDNASGRR SVVTAATAGS EAGASSVPAF FSEDDSQSND SSDSDSSSSQ
1681 SDDIEQETFM LDEPLERTTN SSHANGAAQA PRSMQWAVRN TQHQRAASTA PSSTSTPAAS 
1741 SAGLIYIDPS NLRRSGTIST SAAAAAAALE ASHASSYLTS ASSLARAYSI VIRQISDLMG
1801 LIPKYNHLVY SQIPAAVKLT YQDAVNLQNY VEEKLIPTWN WMVSIMDSTE AQLRYGSALA
1861 SAGDPGHPNH PLHASQNSAR RERMTAREEA SLRTLEGRRR ATLLSARQGM MSARGDFLNY 
1921 ALSLMRSHND EHSDVLPVLD VCSLKHVAYV FQALIYWIKA MNQQTTLDTP QLERKRTREL 
1981 LELGIDNEDS EHENDDDTNQ SATLNDKDDD SLPAETGQNH PFFRRSDSMT FLGCIPPNPF 

2041 EVPLAEAIPL ADQPHLLQPN ARKEDLFGRP SQGLYSSSAS SGKCLMEVTV DRNCLEVLPT
2101 KMSYAANLKN VMNMQNRQKK EGEEQPVLPE ETESSKPGPS AHDLAAQLKS SLLAEIGLTE 
2161 SEGPPLTSFR PQCSFMGMVI SHDMLLGRWR LSLELFGRVF MEDVGAEPGS ILTELGGFEV
2221 KESKFRREME KLRNQQSRDL SLEVDRDRDL LIQQTMRQLN NHFGRRCATT PMAVHRVKVT 
2281 FKDEPGEGSG VARSFYTAIA QAFLSNEKLP NLECIQNANK GTHTSLMQRL RNRGERDRER
2341 EREREMRRSS GLRAGSRRDR DRDFRRQLSI DTRPFRPASE GNPSDDPEPL PAHRQALGER
2401 LYPRVQAMQP AFASKITGML LELSPAQLLL LLASEDSLRA RVDEAMELII AHGRENGADS
2461 ILDLGLVDSS EKVQQENRKR HGSSRSVVDM DLDDTDDGDD NAPLFYQPGK RGFYTPRPGK
2521 NTEARLNCFR NIGRILGLCL LQNELCPITL NRHVIKVLLG RKVNWHDFAF FDFVMYESLR 
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FIGURE 4 
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FIGURE 5C 
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FIGURE 6 
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FIGURE 8A 
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FIGURE 8B 
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